feat: Docker runtime improvements and srsly dependency removal #58
Summary
This PR significantly improves Docker runtime configuration and removes the problematic `srsly` dependency that was causing compilation issues with cloud native buildpacks.
Problem Solved
The original issue was that `srsly` (a spaCy dependency) contains C extensions that require compilation, causing failures in cloud native buildpack environments. Additionally, the CLI was hardcoded to specific paths, making Docker usage difficult.
Key Changes
🔧 Dependency Management
🐳 Docker Runtime Configuration
- `KBP_WORK_DIR` - Working directory
- `KBP_HOME` - Configuration directory (replaces hardcoded `~/.kbp`)
- `KBP_CONFIG_PATH` - Custom config file location
- `KBP_KNOWLEDGE_BASE_PATH` - Documents directory
- `KBP_METADATA_STORE_PATH` - Metadata storage location
🛠️ New Docker Tooling
- `scripts/docker-run.sh` - User-friendly wrapper script with host networking
- `docker-compose.app.yml` - Production-ready compose configuration
- `README-docker.md` - Comprehensive Docker usage guide
📁 Path Flexibility
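As a rough illustration of how the `KBP_*` variables listed above could be resolved with fallbacks, here is a minimal shell sketch. It is not the CLI's actual resolution logic. Only the `~/.kbp` fallback for `KBP_HOME` comes from this PR; the function name `resolve_kbp_paths`, the `config.yaml` file name, the `metadata` directory name, and the other defaults are hypothetical.

```bash
# Illustrative sketch only: resolve KBP_* paths, falling back to defaults.
# Only the ~/.kbp fallback is taken from the PR text; every other default
# below (config.yaml, metadata, current directory) is an assumption.
resolve_kbp_paths() {
  # Working directory: assumed to default to the current directory.
  KBP_WORK_DIR="${KBP_WORK_DIR:-$PWD}"
  # Configuration directory: replaces the previously hardcoded ~/.kbp.
  KBP_HOME="${KBP_HOME:-$HOME/.kbp}"
  # Hypothetical defaults derived from the two values above.
  KBP_CONFIG_PATH="${KBP_CONFIG_PATH:-$KBP_HOME/config.yaml}"
  KBP_KNOWLEDGE_BASE_PATH="${KBP_KNOWLEDGE_BASE_PATH:-$KBP_WORK_DIR}"
  KBP_METADATA_STORE_PATH="${KBP_METADATA_STORE_PATH:-$KBP_HOME/metadata}"
  export KBP_WORK_DIR KBP_HOME KBP_CONFIG_PATH \
    KBP_KNOWLEDGE_BASE_PATH KBP_METADATA_STORE_PATH
}
```

Explicitly set variables always win, so a container can point every path at a mounted volume without touching the defaults.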
Testing
```bash
# Build and test Docker image
docker build -t knowledgebase-processor:latest .

# Test with environment variables
docker run --rm -v "$(pwd):/workspace" \
  -e KBP_WORK_DIR=/workspace \
  -e KBP_HOME=/workspace/.kbp \
  -w /workspace \
  knowledgebase-processor:latest kb --help

# Test wrapper script
./scripts/docker-run.sh init
./scripts/docker-run.sh scan
```
Usage Examples
Quick Start
```bash
# Initialize and scan documents
./scripts/docker-run.sh init
./scripts/docker-run.sh scan

# With custom directory
./scripts/docker-run.sh -w ~/Documents init

# Continuous monitoring with host network access
./scripts/docker-run.sh publish --watch
```
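For readers wondering what a wrapper of this kind might do, the following is a minimal sketch and not the contents of `scripts/docker-run.sh`. It assumes the wrapper mounts the chosen directory at `/workspace`, forwards the remaining arguments to `kb`, and uses Docker's `--network host` option. The function name `docker_run` and the `DOCKER` override variable (used here for a dry run) are hypothetical.

```bash
# Illustrative sketch only; not the actual scripts/docker-run.sh.
# DOCKER is a hypothetical override so the command can be dry-run with echo.
docker_run() {
  work_dir="$PWD"
  # -w DIR selects the host directory to mount (mirrors the usage above).
  if [ "${1:-}" = "-w" ]; then
    work_dir="$2"
    shift 2
  fi
  # --network host lets the container reach services on the host,
  # such as a local SPARQL endpoint.
  ${DOCKER:-docker} run --rm --network host \
    -v "${work_dir}:/workspace" \
    -e KBP_WORK_DIR=/workspace \
    -e KBP_HOME=/workspace/.kbp \
    -w /workspace \
    knowledgebase-processor:latest kb "$@"
}

# Dry run: print the docker arguments instead of executing them.
DOCKER=echo docker_run -w ~/Documents init
```

Keeping the mount point fixed at `/workspace` means the in-container paths stay the same regardless of which host directory is selected.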
Docker Compose
```bash
# Interactive mode
docker-compose -f docker-compose.app.yml up kbp

# Watch mode with Fuseki
docker-compose -f docker-compose.app.yml up fuseki kbp-watch
```
Benefits
✅ No more compilation issues - Cloud native buildpacks now work
✅ Flexible Docker deployment - Works with any directory structure
✅ Host network connectivity - Can access local SPARQL endpoints
✅ Configurable paths - Adapts to different runtime environments
✅ Production ready - Complete tooling and documentation
Breaking Changes
Migration Guide
For users currently using spaCy/entity recognition:
For Docker users:
🤖 Generated with Claude Code